{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "E7qE-VXQKnWW"
      },
      "source": [
        "# Self-Reflecting Gift Agent with Haystack and MongoDB Atlas\n",
        "This notebook demonstrates how to build a self-reflecting gift selection agent using [Haystack](https://haystack.deepset.ai/) and MongoDB Atlas!\n",
        "\n",
        "The agent helps optimize gift selections based on children's wishlists and budget constraints. It uses MongoDB Atlas vector search for semantic matching and self-reflection to refine the suggested gift combinations.\n",
        "\n",
        "**Components to use in this notebook:**\n",
        "- [`OpenAITextEmbedder`](https://docs.haystack.deepset.ai/docs/openaitextembedder) for query embedding\n",
        "- [`MongoDBAtlasEmbeddingRetriever`](https://docs.haystack.deepset.ai/docs/) for finding relevant gifts\n",
        "- [`PromptBuilder`](https://docs.haystack.deepset.ai/docs/promptbuilder) for creating the prompt\n",
        "- [`OpenAIGenerator`](https://docs.haystack.deepset.ai/docs/openaigenerator) for generating responses\n",
        "- Custom `GiftChecker` component for self-reflection\n",
        "\n",
        "### **Prerequisites**\n",
        "\n",
        "Before running this notebook, you'll need:\n",
        "\n",
        "* A MongoDB Atlas account and cluster\n",
        "* A Python environment with `haystack-ai`, `mongodb-atlas-haystack` and other required packages\n",
        "* An OpenAI API key with access to GPT-4 and `text-embedding-3-small`"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "QMTHZTGJKnWX"
      },
      "outputs": [],
      "source": [
        "# Install required packages\n",
        "!pip install haystack-ai mongodb-atlas-haystack tiktoken datasets colorama"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Jsbb99NzKnWX"
      },
      "source": [
        "## Configure Environment\n",
        "\n",
        "* Create a free MongoDB Atlas account at https://www.mongodb.com/cloud/atlas/register\n",
        "* Create a new cluster (free tier is sufficient). Find more details in [this tutorial](https://www.mongodb.com/docs/guides/atlas/cluster/#create-a-cluster)\n",
        "* Create a database user with read/write permissions\n",
        "* Get your [connection string](https://www.mongodb.com/docs/atlas/tutorial/connect-to-your-cluster/#connect-to-your-atlas-cluster) from the Atlas UI (click \"Connect\" > \"Connect your application\")\n",
        "* The connection string should look like this: `mongodb+srv://<db_username>:<db_password>@<clustername>.xxxxx.mongodb.net/?retryWrites=true...`. Replace `<db_password>` in the connection string with your database user's password\n",
        "* Enable network access from your IP address in the Network Access settings (for a quick test, you can add `0.0.0.0/0` to your network access list, which allows connections from any IP address).\n",
        "\n",
        "Set up your MongoDB Atlas and OpenAI credentials:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "yPQ6rWPWKnWX"
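      },
      "outputs": [],
      "source": [
        "import getpass\n",
        "import os\n",
        "import re\n",
        "\n",
        "conn_str = getpass.getpass(\"Enter your MongoDB connection string:\")\n",
        "# Tag the connection with an appName: replace an existing value (keeping any\n",
        "# query parameters that follow it) or append one if none is present\n",
        "conn_str = (\n",
        "    re.sub(r\"appName=[^&\\s]*\", \"appName=devrel.ai.haystack_partner\", conn_str)\n",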
        "    if \"appName=\" in conn_str\n",
        "    else conn_str\n",
        "    + (\"&\" if \"?\" in conn_str else \"?\")\n",
        "    + \"appName=devrel.ai.haystack_partner\"\n",
        ")\n",
        "os.environ[\"MONGO_CONNECTION_STRING\"] = conn_str\n",
        "os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your OpenAI API Key:\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pioj7eg5KnWX"
      },
      "source": [
        "## Create Sample Gift Dataset\n",
        "\n",
        "Let's create a dataset of gifts with prices and categories:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zpYX2h6wKnWX"
      },
      "outputs": [],
      "source": [
        "dataset = {\n",
        "    \"train\": [\n",
        "        {\n",
        "            \"title\": \"LEGO Star Wars Set\",\n",
        "            \"price\": \"$49.99\",\n",
        "            \"description\": \"Build your own galaxy with this exciting LEGO Star Wars set\",\n",
        "            \"category\": \"Toys\",\n",
        "            \"age_range\": \"7-12\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Remote Control Car\",\n",
        "            \"price\": \"$29.99\",\n",
        "            \"description\": \"Fast and fun RC car with full directional control\",\n",
        "            \"category\": \"Toys\",\n",
        "            \"age_range\": \"6-10\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Art Set\",\n",
        "            \"price\": \"$24.99\",\n",
        "            \"description\": \"Complete art set with paints, brushes, and canvas\",\n",
        "            \"category\": \"Arts & Crafts\",\n",
        "            \"age_range\": \"5-15\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Science Kit\",\n",
        "            \"price\": \"$34.99\",\n",
        "            \"description\": \"Educational science experiments kit\",\n",
        "            \"category\": \"Educational\",\n",
        "            \"age_range\": \"8-14\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Dollhouse\",\n",
        "            \"price\": \"$89.99\",\n",
        "            \"description\": \"Beautiful wooden dollhouse with furniture\",\n",
        "            \"category\": \"Toys\",\n",
        "            \"age_range\": \"4-10\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Building Blocks Set\",\n",
        "            \"price\": \"$39.99\",\n",
        "            \"description\": \"Classic wooden building blocks in various shapes and colors\",\n",
        "            \"category\": \"Educational\",\n",
        "            \"age_range\": \"3-8\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Board Game Collection\",\n",
        "            \"price\": \"$44.99\",\n",
        "            \"description\": \"Set of 5 classic family board games\",\n",
        "            \"category\": \"Games\",\n",
        "            \"age_range\": \"6-99\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Puppet Theater\",\n",
        "            \"price\": \"$59.99\",\n",
        "            \"description\": \"Wooden puppet theater with 6 hand puppets\",\n",
        "            \"category\": \"Creative Play\",\n",
        "            \"age_range\": \"4-12\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Robot Building Kit\",\n",
        "            \"price\": \"$69.99\",\n",
        "            \"description\": \"Build and program your own robot with this STEM kit\",\n",
        "            \"category\": \"Educational\",\n",
        "            \"age_range\": \"10-16\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Play Kitchen\",\n",
        "            \"price\": \"$79.99\",\n",
        "            \"description\": \"Realistic play kitchen with sounds and accessories\",\n",
        "            \"category\": \"Pretend Play\",\n",
        "            \"age_range\": \"3-8\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Musical Instrument Set\",\n",
        "            \"price\": \"$45.99\",\n",
        "            \"description\": \"Collection of kid-friendly musical instruments\",\n",
        "            \"category\": \"Music\",\n",
        "            \"age_range\": \"3-10\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Sports Equipment Pack\",\n",
        "            \"price\": \"$54.99\",\n",
        "            \"description\": \"Complete set of kids' sports gear including ball, bat, and net\",\n",
        "            \"category\": \"Sports\",\n",
        "            \"age_range\": \"6-12\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Magic Tricks Kit\",\n",
        "            \"price\": \"$29.99\",\n",
        "            \"description\": \"Professional magic set with instruction manual\",\n",
        "            \"category\": \"Entertainment\",\n",
        "            \"age_range\": \"8-15\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Dinosaur Collection\",\n",
        "            \"price\": \"$39.99\",\n",
        "            \"description\": \"Set of 12 detailed dinosaur figures with fact cards\",\n",
        "            \"category\": \"Educational\",\n",
        "            \"age_range\": \"4-12\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Craft Supply Bundle\",\n",
        "            \"price\": \"$49.99\",\n",
        "            \"description\": \"Comprehensive craft supplies including beads, yarn, and tools\",\n",
        "            \"category\": \"Arts & Crafts\",\n",
        "            \"age_range\": \"6-16\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Coding for Kids Set\",\n",
        "            \"price\": \"$64.99\",\n",
        "            \"description\": \"Interactive coding kit with programmable robot and game cards\",\n",
        "            \"category\": \"STEM\",\n",
        "            \"age_range\": \"8-14\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Dress Up Trunk\",\n",
        "            \"price\": \"$49.99\",\n",
        "            \"description\": \"Collection of costumes and accessories for imaginative play\",\n",
        "            \"category\": \"Pretend Play\",\n",
        "            \"age_range\": \"3-10\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Microscope Kit\",\n",
        "            \"price\": \"$59.99\",\n",
        "            \"description\": \"Real working microscope with prepared slides and tools\",\n",
        "            \"category\": \"Science\",\n",
        "            \"age_range\": \"10-15\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Outdoor Explorer Kit\",\n",
        "            \"price\": \"$34.99\",\n",
        "            \"description\": \"Nature exploration set with binoculars, compass, and field guide\",\n",
        "            \"category\": \"Outdoor\",\n",
        "            \"age_range\": \"7-12\",\n",
        "        },\n",
        "        {\n",
        "            \"title\": \"Art Pottery Studio\",\n",
        "            \"price\": \"$69.99\",\n",
        "            \"description\": \"Complete pottery wheel set with clay and glazing materials\",\n",
        "            \"category\": \"Arts & Crafts\",\n",
        "            \"age_range\": \"8-16\",\n",
        "        },\n",
        "    ]\n",
        "}"
      ]
    },
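    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The prices above are stored as strings such as `$49.99`, and the pipeline built later in this notebook leaves the budget arithmetic to the LLM. As a minimal sketch that is not part of the original pipeline, the hypothetical helpers `parse_price` and `within_budget` below show one way to double-check a suggested selection against a budget. The helper names and the small `sample_gifts` list are assumptions made for illustration."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from decimal import Decimal\n",
        "\n",
        "# Small sample in the same shape as the dataset entries above (illustrative only)\n",
        "sample_gifts = [\n",
        "    {\"title\": \"LEGO Star Wars Set\", \"price\": \"$49.99\"},\n",
        "    {\"title\": \"Science Kit\", \"price\": \"$34.99\"},\n",
        "    {\"title\": \"Dollhouse\", \"price\": \"$89.99\"},\n",
        "]\n",
        "\n",
        "\n",
        "def parse_price(price: str) -> Decimal:\n",
        "    \"\"\"Convert a price string such as '$49.99' into a Decimal.\"\"\"\n",
        "    return Decimal(price.replace(\"$\", \"\").replace(\",\", \"\").strip())\n",
        "\n",
        "\n",
        "def within_budget(titles, gifts, budget):\n",
        "    \"\"\"Return (total, fits) for the gifts whose titles were selected.\"\"\"\n",
        "    prices = {gift[\"title\"]: parse_price(gift[\"price\"]) for gift in gifts}\n",
        "    total = sum((prices[title] for title in titles), Decimal(\"0\"))\n",
        "    return total, total <= Decimal(str(budget))\n",
        "\n",
        "\n",
        "# Hypothetical check of a two-gift selection against a $100 budget\n",
        "total, fits = within_budget([\"Science Kit\", \"LEGO Star Wars Set\"], sample_gifts, 100)\n",
        "print(total, fits)"
      ]
    },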
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "OX6js4s4KnWX"
      },
      "source": [
        "## Initialize MongoDB Atlas\n",
        "\n",
        "First, we need to set up our MongoDB Atlas collection and create a vector search index. This step is crucial for enabling semantic search capabilities:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "lpySpbqbLOKw",
        "outputId": "9d55bba8-434c-434b-e4fb-ca3de6e33651"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "New search index named vector_index is building.\n",
            "Polling to check if the index is ready. This may take up to a minute.\n",
            "vector_index is ready for querying.\n"
          ]
        }
      ],
      "source": [
        "# Create collection gifts and add the vector index\n",
        "\n",
        "import time\n",
        "\n",
        "from pymongo import MongoClient\n",
        "from pymongo.operations import SearchIndexModel\n",
        "\n",
        "client = MongoClient(\n",
        "    os.environ[\"MONGO_CONNECTION_STRING\"],\n",
        "    appname=\"devrel.showcase.haystack_gifting_agent\",\n",
        ")\n",
        "db = client[\"santa_workshop\"]\n",
        "collection = db[\"gifts\"]\n",
        "\n",
        "# Create the collection only if it does not already exist\n",
        "if \"gifts\" not in db.list_collection_names():\n",
        "    db.create_collection(\"gifts\")\n",
        "\n",
        "# Define the vector search index; 1536 dimensions matches text-embedding-3-small\n",
        "search_index_model = SearchIndexModel(\n",
        "    definition={\n",
        "        \"fields\": [\n",
        "            {\n",
        "                \"type\": \"vector\",\n",
        "                \"numDimensions\": 1536,\n",
        "                \"path\": \"embedding\",\n",
        "                \"similarity\": \"cosine\",\n",
        "            },\n",
        "        ]\n",
        "    },\n",
        "    name=\"vector_index\",\n",
        "    type=\"vectorSearch\",\n",
        ")\n",
        "result = collection.create_search_index(model=search_index_model)\n",
        "print(\"New search index named \" + result + \" is building.\")\n",
        "# Poll until Atlas reports the index as queryable\n",
        "print(\"Polling to check if the index is ready. This may take up to a minute.\")\n",
        "while True:\n",
        "    indices = list(collection.list_search_indexes(result))\n",
        "    if indices and indices[0].get(\"queryable\") is True:\n",
        "        break\n",
        "    time.sleep(5)\n",
        "print(result + \" is ready for querying.\")\n",
        "client.close()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YQ4Kv8dpofGp"
      },
      "source": [
        "## Initialize Document Store and Index Documents\n",
        "\n",
        "Now let's set up the [MongoDBAtlasDocumentStore](https://docs.haystack.deepset.ai/docs/mongodbatlasdocumentstore) and index our gift data:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "d6bmEu1WKnWY",
        "outputId": "9465a8f8-d51e-4a54-ebf7-858fceaa4ade"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "Calculating embeddings: 100%|██████████| 1/1 [00:00<00:00,  1.25it/s]\n"
          ]
        },
        {
          "data": {
            "text/plain": [
              "{'doc_embedder': {'meta': {'model': 'text-embedding-3-small',\n",
              "   'usage': {'prompt_tokens': 54, 'total_tokens': 54}}},\n",
              " 'doc_writer': {'documents_written': 5}}"
            ]
          },
          "execution_count": 10,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "from bson import json_util\n",
        "from haystack import Document, Pipeline\n",
        "from haystack.components.embedders import OpenAIDocumentEmbedder\n",
        "from haystack.components.writers import DocumentWriter\n",
        "from haystack.document_stores.types import DuplicatePolicy\n",
        "from haystack_integrations.document_stores.mongodb_atlas import (\n",
        "    MongoDBAtlasDocumentStore,\n",
        ")\n",
        "\n",
        "# Initialize document store\n",
        "document_store = MongoDBAtlasDocumentStore(\n",
        "    database_name=\"santa_workshop\",\n",
        "    collection_name=\"gifts\",\n",
        "    vector_search_index=\"vector_index\",\n",
        ")\n",
        "\n",
        "# Convert dataset to documents\n",
        "insert_data = []\n",
        "for gift in dataset[\"train\"]:\n",
        "    doc_gift = json_util.loads(json_util.dumps(gift))\n",
        "    haystack_doc = Document(content=doc_gift[\"title\"], meta=doc_gift)\n",
        "    insert_data.append(haystack_doc)\n",
        "\n",
        "# Create indexing pipeline\n",
        "doc_writer = DocumentWriter(document_store=document_store, policy=DuplicatePolicy.SKIP)\n",
        "doc_embedder = OpenAIDocumentEmbedder(\n",
        "    model=\"text-embedding-3-small\", meta_fields_to_embed=[\"description\"]\n",
        ")\n",
        "\n",
        "indexing_pipe = Pipeline()\n",
        "indexing_pipe.add_component(instance=doc_embedder, name=\"doc_embedder\")\n",
        "indexing_pipe.add_component(instance=doc_writer, name=\"doc_writer\")\n",
        "indexing_pipe.connect(\"doc_embedder.documents\", \"doc_writer.documents\")\n",
        "\n",
        "# Index the documents\n",
        "indexing_pipe.run({\"doc_embedder\": {\"documents\": insert_data}})"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "r2yjTDncKnWY"
      },
      "source": [
        "## Create Self-Reflecting Gift Selection Pipeline\n",
        "\n",
        "Now comes the fun part! Create a pipeline that can:\n",
        "1. Take a gift request query\n",
        "2. Find relevant gifts using vector search\n",
        "3. Self-reflect on selections to optimize for budget and preferences\n",
        "\n",
        "You need a custom `GiftChecker` component that checks whether more optimization is required. It looks for the keyword `DONE` in the LLM reply: if the keyword is present, the selection is final; otherwise the reply is sent back to the prompt for another pass. Learn how to write your own Haystack component in [Docs: Creating Custom Components](https://docs.haystack.deepset.ai/docs/custom-components)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "OMbXMUKWKnWY",
        "outputId": "5f68914e-b0e8-4804-83f7-10645c5c865a"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "<haystack.core.pipeline.pipeline.Pipeline object at 0x7d5853ba7160>\n",
              "🚅 Components\n",
              "  - text_embedder: OpenAITextEmbedder\n",
              "  - retriever: MongoDBAtlasEmbeddingRetriever\n",
              "  - prompt_builder: PromptBuilder\n",
              "  - checker: GiftChecker\n",
              "  - llm: OpenAIGenerator\n",
              "🛤️ Connections\n",
              "  - text_embedder.embedding -> retriever.query_embedding (List[float])\n",
              "  - retriever.documents -> prompt_builder.documents (List[Document])\n",
              "  - prompt_builder.prompt -> llm.prompt (str)\n",
              "  - checker.gifts_to_check -> prompt_builder.gifts_to_check (str)\n",
              "  - llm.replies -> checker.replies (List[str])"
            ]
          },
          "execution_count": 14,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "from typing import List\n",
        "\n",
        "from colorama import Fore\n",
        "from haystack import component\n",
        "from haystack.components.builders.prompt_builder import PromptBuilder\n",
        "from haystack.components.embedders import OpenAITextEmbedder\n",
        "from haystack.components.generators import OpenAIGenerator\n",
        "from haystack_integrations.components.retrievers.mongodb_atlas import (\n",
        "    MongoDBAtlasEmbeddingRetriever,\n",
        ")\n",
        "\n",
        "\n",
        "@component\n",
        "class GiftChecker:\n",
        "    \"\"\"Route the LLM reply: finish on DONE, otherwise request another reflection pass.\"\"\"\n",
        "\n",
        "    @component.output_types(gifts_to_check=str, gifts=str)\n",
        "    def run(self, replies: List[str]):\n",
        "        if \"DONE\" in replies[0]:\n",
        "            # The LLM judged the selection optimal: emit it as the final answer\n",
        "            return {\"gifts\": replies[0].replace(\"DONE\", \"\")}\n",
        "        else:\n",
        "            # Send the selection back to the prompt builder for another pass\n",
        "            print(Fore.RED + \"Not optimized yet, could find better gift combinations\")\n",
        "            return {\"gifts_to_check\": replies[0]}\n",
        "\n",
        "\n",
        "# Create prompt template\n",
        "prompt_template = \"\"\"\n",
        "    You are Santa's gift selection assistant. Below you have a list of available gifts with their prices.\n",
        "    Based on the child's wishlist and budget, suggest appropriate gifts that maximize joy while staying within budget.\n",
        "\n",
        "    Available Gifts:\n",
        "    {% for doc in documents %}\n",
        "        Gift: {{ doc.content }}\n",
        "        Price: {{ doc.meta['price']}}\n",
        "        Age Range: {{ doc.meta['age_range']}}\n",
        "    {% endfor %}\n",
        "\n",
        "    Query: {{query}}\n",
        "    {% if gifts_to_check %}\n",
        "        Previous gift selection: {{gifts_to_check}}\n",
        "        Can we optimize this selection for better value within budget?\n",
        "        If optimal, say 'DONE' and return the selection\n",
        "        If not, suggest a better combination\n",
        "    {% endif %}\n",
        "\n",
        "    Gift Selection:\n",
        "\"\"\"\n",
        "\n",
        "# Create the pipeline\n",
        "gift_pipeline = Pipeline(max_runs_per_component=5)\n",
        "gift_pipeline.add_component(\n",
        "    \"text_embedder\", OpenAITextEmbedder(model=\"text-embedding-3-small\")\n",
        ")\n",
        "gift_pipeline.add_component(\n",
        "    instance=MongoDBAtlasEmbeddingRetriever(document_store=document_store, top_k=5),\n",
        "    name=\"retriever\",\n",
        ")\n",
        "gift_pipeline.add_component(\n",
        "    instance=PromptBuilder(template=prompt_template), name=\"prompt_builder\"\n",
        ")\n",
        "gift_pipeline.add_component(instance=GiftChecker(), name=\"checker\")\n",
        "gift_pipeline.add_component(instance=OpenAIGenerator(model=\"gpt-4\"), name=\"llm\")\n",
        "\n",
        "# Connect components\n",
        "gift_pipeline.connect(\"text_embedder.embedding\", \"retriever.query_embedding\")\n",
        "gift_pipeline.connect(\"retriever.documents\", \"prompt_builder.documents\")\n",
        "gift_pipeline.connect(\"checker.gifts_to_check\", \"prompt_builder.gifts_to_check\")\n",
        "gift_pipeline.connect(\"prompt_builder\", \"llm\")\n",
        "gift_pipeline.connect(\"llm\", \"checker\")"
      ]
    },
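    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The connections above form a loop: `llm` feeds `checker`, and `checker.gifts_to_check` feeds back into `prompt_builder` until a reply contains `DONE` or the `max_runs_per_component=5` limit is reached. The sketch below imitates that control flow in plain Python. The hypothetical `fake_llm` stands in for `OpenAIGenerator` and `reflect` for the loop; both names and the canned replies are assumptions for illustration, and this is not how Haystack schedules components internally."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "def fake_llm(prompt: str) -> str:\n",
        "    \"\"\"Stand-in for the LLM: approves a selection once it has seen a previous one.\"\"\"\n",
        "    if \"Previous gift selection\" in prompt:\n",
        "        return \"DONE Science Kit, LEGO Star Wars Set\"\n",
        "    return \"Science Kit, Robot Building Kit\"\n",
        "\n",
        "\n",
        "def reflect(query: str, max_runs: int = 5) -> str:\n",
        "    \"\"\"Re-prompt until the reply contains DONE or the run limit is reached.\"\"\"\n",
        "    gifts_to_check = None\n",
        "    for _ in range(max_runs):\n",
        "        prompt = f\"Query: {query}\"\n",
        "        if gifts_to_check:\n",
        "            prompt += f\"\\nPrevious gift selection: {gifts_to_check}\"\n",
        "        reply = fake_llm(prompt)\n",
        "        if \"DONE\" in reply:\n",
        "            return reply.replace(\"DONE\", \"\").strip()\n",
        "        # Not approved yet: loop back with the latest selection\n",
        "        gifts_to_check = reply\n",
        "    raise RuntimeError(\"No approved selection within the run limit\")\n",
        "\n",
        "\n",
        "print(reflect(\"Gifts for a 9-year-old who loves science. Budget: $100\"))"
      ]
    },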
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9Vmg-VghKnWY"
      },
      "source": [
        "## Test Your Gift Selection Agent\n",
        "\n",
        "Let's test our pipeline with a sample query:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "3-qDYOeiKnWY",
        "outputId": "bcb2afe5-5e35-4882-f1dd-257bae9b7789"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\u001b[31mNot optimized yet, could find better gift combinations\n",
            "\u001b[32mScience Kit, LEGO Star Wars Set\n",
            "    Total cost: $84.98\n",
            "    This selection is under budget and suits the child's interest in science and building things.\n",
            "    So, Santa says, \"\"!\n"
          ]
        }
      ],
      "source": [
        "# query = \"Need gifts for a creative 6-year-old interested in art. Budget: $50\"\n",
        "# query = \"Looking for educational toys for a 12-year-old. Budget: $75\"\n",
        "query = (\n",
        "    \"Find gifts for a 9-year-old who loves science and building things. Budget: $100\"\n",
        ")\n",
        "\n",
        "result = gift_pipeline.run(\n",
        "    {\"text_embedder\": {\"text\": query}, \"prompt_builder\": {\"query\": query}}\n",
        ")\n",
        "\n",
        "print(Fore.GREEN + result[\"checker\"][\"gifts\"])"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    },
    "widgets": {
      "application/vnd.jupyter.widget-state+json": {
        "state": {}
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
